Abstract

This paper presents a conceptual architectural design of a four-channel Orthogonal Frequency
Division Multiplexing (OFDM) system with an aggregate information throughput of 622 
megabits per second (Mbps). Primary emphasis is placed on the generation and detection of 
the composite waveform using polyphase filter and Discrete Fourier Transform (DFT) 
approaches to digitally stack and bandlimit the individual carriers. The four-channel approach 
enables the implementation of a system that can be both power and bandwidth efficient, yet 
enough parallelism exists to meet higher data rate goals. It also enables a DC power efficient 
transmitter that is suitable for on-board satellite systems, and a moderately complex receiver 
that is suitable for low-cost ground terminals. The major advantage of the system compared
to a single-channel system is lower complexity and lower DC power consumption. This is because the
highest sample rate is 1/4 that of the single-channel system, and synchronization can occur at
no more than 1/4 the rate of a single-channel system, depending on the synchronization technique. The
major disadvantage is the increased peak-to-average power ratio relative to the single-channel
system. Simulation results in the form of bit-error-rate (BER) curves are presented in this paper.

Introduction 

A number of proposed broadband satellite communications systems feature rates at or in excess 
of 622 Mbps per downlink with spectrum allocations generally being less than 500 MHz. This 
requires the application of bandwidth- and power-efficient transmission techniques. A number of
approaches to implementing such techniques exist, including analog, digital, and mixed-signal systems,
in single-channel or multi-channel form. In general, the digital implementations offer more advantages.
However, fully digital implementation at data rates in excess of 622 Mbps is difficult due to the 
high clock speeds that are required. For example, an uncoded 16-ary Quadrature Amplitude 
Modulation (16QAM) system requires a symbol rate of about 156 Msymbols/sec to attain a
throughput of 622 Mbps. If four samples per symbol are used to generate and reconstruct the
waveform, the sample rate is 622 Msamples/sec. 
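The arithmetic behind these figures can be checked with a short sketch. The helper name `rates` and its parameters are illustrative only; the four-channel call simply divides the same throughput over four carriers and says nothing about the samples per symbol chosen later in the paper.

```python
def rates(bit_rate_bps, bits_per_symbol, samples_per_symbol, channels=1):
    """Return (per-channel symbol rate, per-channel sample rate) in Hz."""
    symbol_rate = bit_rate_bps / bits_per_symbol / channels
    return symbol_rate, symbol_rate * samples_per_symbol

# Single-channel uncoded 16QAM: 4 bits per symbol, 4 samples per symbol.
single_sym, single_samp = rates(622e6, 4, 4)
print(single_sym / 1e6, single_samp / 1e6)   # 155.5 622.0

# Same throughput split over four carriers: each carrier runs at 1/4 the symbol rate.
quad_sym, _ = rates(622e6, 4, 4, channels=4)
print(quad_sym / 1e6)                        # 38.875
```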

In this paper, we examine the use of multichannel techniques as another way of reducing the 
sample rate. One such technique, Multi-Carrier Modulation (MCM) [7], divides the data into a
number of low-rate channels that are stacked in frequency and separated by 1/T, where T is the symbol period of each channel.
MCM, sometimes also called OFDM, is being proposed for numerous systems including mobile
wireless and digital subscriber line systems.

OFDM System Overview 

The basic OFDM waveform in this paper is constructed by dividing an incoming data stream 
into N=4 channels, each channel using Offset-16QAM. Each input channel symbol is denoted 
$x_i$, $i = 0, 1, 2, 7$, and is a complex number that is written as

$$x_i = a_i + j b_i, \quad i = 0, 1, 2, 7$$

The reason the fourth channel is labeled 7 will become clear in the discrete time system. The 
time domain waveform of each Offset-16QAM channel is thus written as

$$x_i(t) = a_i(t) + j b_i(t - T/2), \quad i = 0, 1, 2, 7$$

where T is the symbol period. Each complex channel is then filtered to limit its bandwidth.


This is written as 

$$x_i(t) * h(t)$$

where h(t) is the filter function and $*$ denotes convolution.


Each complex channel is then stacked in frequency with carriers that are separated in frequency 
by 1/T and are separated in phase by $\pi/2$. The complex carriers can thus be written as

$$c_i(t) = \exp\left(j\left(\frac{2\pi}{T}t + \frac{\pi}{2}\right)i\right), \quad i = 0, 1, 2, 7$$


Each channel's modulated waveform is then 

$$m_i(t) = \left(x_i(t) * h(t)\right) c_i(t), \quad i = 0, 1, 2, 7$$

and the overall transmitted waveform is the summation of all channels,

$$m(t) = \sum_{i=0,1,2,7} m_i(t) = \sum_{i=0,1,2,7} \left(\left(a_i(t) + j b_i(t - T/2)\right) * h(t)\right) \exp\left(j\left(\frac{2\pi}{T}t + \frac{\pi}{2}\right)i\right)$$


Pictorially, assuming a spectral shape for h(t), the frequency response of the composite 
waveform is shown below. 
At the receiver side, the composite waveform is split into four channels that are each down-converted by

$$\left(c_i(t)\right)^{-1} = \exp\left(-j\left(\frac{2\pi}{T}t + \frac{\pi}{2}\right)i\right), \quad i = 0, 1, 2, 7$$

Each baseband waveform is then passed through the filter matched to h(t), denoted g(t). The
matched-filtered data is an estimate of the transmitted data and is denoted

$$\hat{x}_i(t) = \hat{a}_i(t) + j \hat{b}_i(t - T/2), \quad i = 0, 1, 2, 7$$

The structure is illustrated below. 



The conditions for zero intersymbol interference (ISI) and interchannel interference (ICI) as 
related to the filters h(t) and g(t) are derived in [8]. They are written as 
It is well known that an efficient implementation of this type of architecture can be achieved
using the combination of a DFT and Inverse DFT (IDFT) at the receiver to perform the
frequency translation and a polyphase filter to facilitate the pulse shaping. In the four-channel
system, a 4-point complex DFT is required to accommodate the four Offset-16QAM channels.
However, without interpolating filters to increase the sample rate, the
practical size is the 8-point complex DFT, which allows the rejection of aliases when converting
the signal to the analog domain. The discrete time version of the modulated waveform can be
written as

$$x_i[n] = x_i(nT_s) = a_i(nT_s) + j b_i(nT_s - T/2), \quad i = 0, 1, 2, 7$$

where $T_s$ is the sample time and n is the sample number.

If $T_s$ is set to 1/8 of the symbol time, the equation becomes

$$x_i[n] = x_i(nT_s) = a_i(nT_s) + j b_i(nT_s - 4T_s), \quad i = 0, 1, 2, 7$$

Similarly, the convolution with the filter can be expanded to 

$$s_i[n] = s_i(nT_s) = x_i(nT_s) * h(nT_s) = \sum_{\tau=0}^{L-1} x_i(\tau)\, h(nT_s - \tau) = \sum_{\tau=0}^{L-1} \left[a_i(\tau) + j b_i(\tau - 4T_s)\right] h(nT_s - \tau), \quad i = 0, 1, 2, 7$$

and

$$s[n] = s(nT_s) = \sum_{i=0,1,2,7} s_i[n] = \sum_{i=0,1,2,7} \sum_{\tau=0}^{L-1} \left[a_i(\tau) + j b_i(\tau - 4T_s)\right] h(nT_s - \tau)$$

Thus, the transmitted signal is 
$$m[n] = m(nT_s) = \sum_{i=0,1,2,7} s_i[n] \exp\left(j\left(\frac{\pi}{4}n + \frac{\pi}{2}\right)i\right) = \sum_{i=0,1,2,7} s_i[n] \exp\left(j\frac{\pi}{4}ni\right) \exp\left(j\frac{\pi}{2}i\right)$$


In general, the 8-point complex DFT of a signal is written as 


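$$\Theta_k = \sum_{i=0}^{7} \theta_i \exp\left(j\frac{\pi}{4}ki\right), \quad k = 0, 1, 2, \ldots, 7$$

Thus if we set $\theta_i = s_i[n] \exp\left(j\frac{\pi}{2}i\right)$, $i = 0, 1, 2, \ldots, 7$, we have $m[n] = \Theta_n$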


and thus the waveform can be generated using an 8-point complex DFT. That is, each set of eight
complex inputs ($s_i[n] = 0$ for i = 3, 4, 5, 6) generates 8 complex outputs. These outputs are
then multiplexed in time to obtain the waveform m[n]. Switching real and imaginary
components, and negating if necessary, can accommodate the $\pi/2$ phase shifts.
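The equivalence between the carrier sum and an 8-point transform can be checked numerically. The sketch below assumes the channel values are held constant over the eight output samples, uses random test data, and relies on NumPy's `ifft`, whose kernel has the same positive exponent sign as the expression above (scaled by 1/8, hence the factor of 8).

```python
import numpy as np

CHANNELS = (0, 1, 2, 7)

rng = np.random.default_rng(0)
s = np.zeros(8, dtype=complex)          # s_i for i = 0..7; bins 3..6 stay zero
s[list(CHANNELS)] = rng.normal(size=4) + 1j * rng.normal(size=4)

# Direct evaluation of m[n] = sum_i s_i exp(j (pi/4 n + pi/2) i) for n = 0..7.
n = np.arange(8)
direct = sum(s[i] * np.exp(1j * (np.pi / 4 * n + np.pi / 2) * i) for i in CHANNELS)

# The same eight samples from one transform of the pre-rotated inputs theta_i.
theta = s * np.exp(1j * np.pi / 2 * np.arange(8))
via_transform = 8 * np.fft.ifft(theta)

print(np.allclose(direct, via_transform))
```

The rotation factors $\exp(j\pi i/2)$ for i = 0, 1, 2, 7 are 1, j, -1, and -j, which is why swapping and negating components is sufficient.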

This direct implementation requires filtering each channel at the sample rate and performing
the DFT at the sample rate. It is also well known that the amount of computation can be
greatly reduced by moving the filter function after the DFT and distributing it among the
channels. In most systems this results in an N-point complex DFT running at the symbol rate
and N polyphase filters with L/N taps each running at the symbol rate (L is the total number of
taps). However, in our system we use Offset-16QAM to maintain the orthogonal nature of each
channel. This is done by imparting a one-half-symbol delay into the filters on the imaginary channels.
The one-half-symbol delay precludes moving the filters to after the DFT. To
circumvent this difficulty, two DFTs are used. The first DFT processes the real components of
the four channels (with the appropriate $\pi/2$ phase shifts) and the second DFT processes the
imaginary components (with the appropriate $\pi/2$ phase shifts). The results of each DFT are
sent through their appropriate polyphase filters and multiplexed. Finally, the multiplexed
imaginary results are delayed by four samples (one half of a symbol) and added to the
multiplexed real results. A similar operation occurs at the receiver. In the next two sections, we
show that this structure offers the potential for complexity reduction at both the transmitter
and receiver, making it suitable for high-speed implementation.
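A minimal numerical sketch of the two-transform structure is given below; it is an illustration under stated assumptions, not the authors' implementation. The 16-tap pulse is a placeholder window, the symbols are random, and the function names are hypothetical. One further assumption: the imaginary branch carries an extra factor $(-1)^i$ in its rotation so that the four-sample delay leaves the carrier phase unchanged; the paper only says the phase shifts are "appropriate".

```python
import numpy as np

CHANNELS = np.array([0, 1, 2, 7])
N = 8  # transform size and samples per symbol

def direct_tx(a, b, h):
    """Reference: shape every channel at the sample rate, then mix and sum."""
    n_sym = a.shape[1]
    n_out = N * (n_sym + 2)
    n = np.arange(n_out)
    m = np.zeros(n_out, dtype=complex)
    for row, i in enumerate(CHANNELS):
        x = np.zeros(n_out, dtype=complex)
        x[0:N * n_sym:N] += a[row]                      # I symbols at n = 8k
        x[N // 2:N // 2 + N * n_sym:N] += 1j * b[row]   # Q symbols at n = 8k + 4
        s = np.convolve(x, h)[:n_out]
        m += s * np.exp(1j * (np.pi / 4 * n + np.pi / 2) * i)
    return m

def branch(sym, rotation, h):
    """One transform per symbol followed by eight two-tap polyphase filters."""
    n_sym = sym.shape[1]
    bins = np.zeros((N, n_sym), dtype=complex)
    bins[CHANNELS] = sym * rotation[:, None]
    blocks = N * np.fft.ifft(bins, axis=0)      # blocks[p, k]: output p for symbol k
    out = np.zeros(N * (n_sym + 2), dtype=complex)
    for p in range(N):
        y = np.convolve(blocks[p], h[p::N])     # taps h[p], h[p + 8] at the symbol rate
        out[p:p + N * len(y):N] = y             # multiplex phase p into the output
    return out

def polyphase_tx(a, b, h):
    rot = np.exp(1j * np.pi / 2 * CHANNELS)
    real = branch(a, rot, h)
    imag = branch(b, rot * (-1.0) ** CHANNELS, h)   # assumed sign fix for the delay
    delayed = np.concatenate([np.zeros(4), imag[:-4]])
    return real + 1j * delayed

rng = np.random.default_rng(1)
a = rng.choice([-3.0, -1.0, 1.0, 3.0], size=(4, 6))
b = rng.choice([-3.0, -1.0, 1.0, 3.0], size=(4, 6))
h = np.hanning(18)[1:-1]                            # placeholder 16-tap pulse
print(np.allclose(direct_tx(a, b, h), polyphase_tx(a, b, h)))
```

In the polyphase form every multiply runs at the symbol rate, which is the source of the complexity reduction described above.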


Modulation and DFT Approach 

The amplitude structure in 16QAM can be used to reduce the complexity of transmit pulse 
shaping filters. Similar principles are applied to the DFT and polyphase filters. First, each 
unique In-phase (I) and Quadrature (Q) modulation level is assigned a 2-bit label representing 
each of the 4 possible amplitude levels in a 16QAM constellation. A 16QAM constellation with 
amplitude levels 

$$\mathrm{Re}\{x_i[n]\} = a_i[n] \in \{-3.0, -1.0, 1.0, 3.0\}, \quad i = 0, 1, 2, 7$$
$$j\,\mathrm{Im}\{x_i[n]\} = j b_i[n] \in \{-3.0j, -1.0j, 1.0j, 3.0j\}, \quad i = 0, 1, 2, 7$$


can be represented as two bit labels as 


NASA/TM—2001-210813



$$A_i[n] \in \{00, 01, 10, 11\}, \quad i = 0, 1, 2, 7$$
$$B_i[n] \in \{00, 01, 10, 11\}, \quad i = 0, 1, 2, 7$$


where $A_i[n]$ is the label for the I value and $B_i[n]$ is the label for the Q value. Labels from the four
channels (i = 0, 1, 2, 7) can be used to form an 8-tuple that represents all possible inputs to the
real-input DFT. Similarly, an 8-tuple is generated to represent all possible inputs to the
imaginary-input DFT. Recall also that the input to the DFT needs to be rotated by a multiple of $\pi/2$ to
maintain orthogonality. That is,


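$$a_i[n] \exp\left(j\frac{\pi}{2}i\right) \in \{-3.0, -1.0, 1.0, 3.0\} \exp\left(j\frac{\pi}{2}i\right), \quad i = 0, 1, 2, 7$$

$$j b_i[n] \exp\left(j\frac{\pi}{2}i\right) \in \{-3.0j, -1.0j, 1.0j, 3.0j\} \exp\left(j\frac{\pi}{2}i\right), \quad i = 0, 1, 2, 7$$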


Thus the labels for the different channels actually represent different amplitude values. The
8-tuples are formed as a concatenation of their 2-bit labels as follows

$$A[n] = A_0[n]\,A_1[n]\,A_2[n]\,A_7[n] \in \{00000000, 00000001, 00000010, \ldots, 11111111\}$$

$$B[n] = B_0[n]\,B_1[n]\,B_2[n]\,B_7[n] \in \{00000000, 00000001, 00000010, \ldots, 11111111\}$$

In our design, we use 4 of the modulation channels. Since only 2 bits are needed to describe each
channel's in-phase or quadrature data, we can use 8-bit blocks of data for processing Q and I
separately. This knowledge of the data to be transmitted is the basis for the design of the digital
transmitter.
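Because each branch sees only 256 possible input words, its transform output can be tabulated in advance. The sketch below illustrates the idea; the label-to-amplitude mapping (00 to -3.0 through 11 to +3.0), the helper names, and the use of the positive-exponent transform from the waveform derivation are assumptions for illustration.

```python
import numpy as np

CHANNELS = np.array([0, 1, 2, 7])
LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])   # assumed: label 0b00 -> -3.0 ... 0b11 -> +3.0

def pack(labels):
    """Concatenate four 2-bit labels (channels 0, 1, 2, 7) into one 8-bit word."""
    word = 0
    for lab in labels:
        word = (word << 2) | lab
    return word

def unpack(word):
    """Split an 8-bit word back into its four 2-bit labels."""
    return [(word >> shift) & 0b11 for shift in (6, 4, 2, 0)]

def transform(amplitudes):
    """8-point transform of the rotated channel amplitudes (bins 3..6 are zero)."""
    bins = np.zeros(8, dtype=complex)
    bins[CHANNELS] = amplitudes * np.exp(1j * np.pi / 2 * CHANNELS)
    return 8 * np.fft.ifft(bins)

# One table row per possible 8-bit word: 256 rows of 8 complex outputs.
TABLE = np.array([transform(LEVELS[unpack(w)]) for w in range(256)])

word = pack([0b10, 0b00, 0b11, 0b01])
print(f"{word:08b}", TABLE.shape)   # 10001101 (256, 8)
```

At run time the transform then reduces to a single table access, `TABLE[word]`, per branch and symbol.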


Transmit DFT Design 

An N-point DFT is written as

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn}, \quad k = 0, 1, 2, \ldots, N-1$$

where $W_N^{kn} = \exp\left(-j\frac{2\pi}{N}kn\right)$.

If N=8, then

$$X(k) = \sum_{n=0}^{7} x(n)\, W_8^{kn}, \quad k = 0, 1, 2, \ldots, 7$$


which can be written in matrix form 
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X(k) = 


111 1111 

* 2* 3* 4* 5 * 6* 


n 

2n 

3 n 

An 

5* 

6* 

in 


e ' j T 






2* 

An 

6 n 

S n 

10 n 

12* 

14* 


e' ] ~ 



i ^ 



3 * 

6 n 

9 n 

12* 

15* 

18* 

21* 

e 1 ” 

e ~ J T 

e~ J ~* 





An 

8* 

12 n 

16* 

20* 

24* 

28* 

i ^ 


e' J ~ 



e-’~ 


5n 

10 K 

15 n 

20 n 

25* 

30* 

35* 

e ~ J T 

e- j ~ 

e - J ~ 

i 1 - 



e-'~ 

6 n 

12 n 

18 n 

24* 

30* 

36* 

42* 

e' J ~ 

t 

e~ J ~ 





In 

\An 

21 n 

28* 

35* 

42* 

49* 

e- J '~ 

i 1 - 

i s ~ 


7“ 




A(0) 

a-(1) 

a(2) 

a(3) 

a(4) 

a(5) 

v(6) 

a(7) 


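Since the four-channel architecture only uses channels 0, 1, 2, and 7, this equation reduces to

$$
\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ X(3) \\ X(4) \\ X(5) \\ X(6) \\ X(7) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & e^{-j\frac{\pi}{4}} & e^{-j\frac{2\pi}{4}} & e^{-j\frac{7\pi}{4}} \\
1 & e^{-j\frac{2\pi}{4}} & e^{-j\frac{4\pi}{4}} & e^{-j\frac{14\pi}{4}} \\
1 & e^{-j\frac{3\pi}{4}} & e^{-j\frac{6\pi}{4}} & e^{-j\frac{21\pi}{4}} \\
1 & e^{-j\frac{4\pi}{4}} & e^{-j\frac{8\pi}{4}} & e^{-j\frac{28\pi}{4}} \\
1 & e^{-j\frac{5\pi}{4}} & e^{-j\frac{10\pi}{4}} & e^{-j\frac{35\pi}{4}} \\
1 & e^{-j\frac{6\pi}{4}} & e^{-j\frac{12\pi}{4}} & e^{-j\frac{42\pi}{4}} \\
1 & e^{-j\frac{7\pi}{4}} & e^{-j\frac{14\pi}{4}} & e^{-j\frac{49\pi}{4}}
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(7) \end{bmatrix}
$$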

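However, each one of the x(n) values is a complex number, thus

$$
\begin{bmatrix} X(0) \\ X(1) \\ X(2) \\ X(3) \\ X(4) \\ X(5) \\ X(6) \\ X(7) \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & e^{-j\frac{\pi}{4}} & e^{-j\frac{2\pi}{4}} & e^{-j\frac{7\pi}{4}} \\
1 & e^{-j\frac{2\pi}{4}} & e^{-j\frac{4\pi}{4}} & e^{-j\frac{14\pi}{4}} \\
1 & e^{-j\frac{3\pi}{4}} & e^{-j\frac{6\pi}{4}} & e^{-j\frac{21\pi}{4}} \\
1 & e^{-j\frac{4\pi}{4}} & e^{-j\frac{8\pi}{4}} & e^{-j\frac{28\pi}{4}} \\
1 & e^{-j\frac{5\pi}{4}} & e^{-j\frac{10\pi}{4}} & e^{-j\frac{35\pi}{4}} \\
1 & e^{-j\frac{6\pi}{4}} & e^{-j\frac{12\pi}{4}} & e^{-j\frac{42\pi}{4}} \\
1 & e^{-j\frac{7\pi}{4}} & e^{-j\frac{14\pi}{4}} & e^{-j\frac{49\pi}{4}}
\end{bmatrix}
\begin{bmatrix} a(0) + jb(0) \\ a(1) + jb(1) \\ a(2) + jb(2) \\ a(7) + jb(7) \end{bmatrix}
$$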
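As a numerical check of the reduced form, the sketch below builds the 8-by-4 matrix from columns 0, 1, 2, and 7 of the 8-point DFT matrix and compares it with a full FFT of a zero-filled input. The test amplitudes are random; NumPy's `fft` uses the same negative-exponent kernel as $W_8^{kn}$.

```python
import numpy as np

k = np.arange(8).reshape(-1, 1)                  # output index k (rows)
cols = np.array([0, 1, 2, 7])                    # occupied input indices (columns)
W_reduced = np.exp(-1j * np.pi / 4 * k * cols)   # 8 x 4 matrix with entries W_8^{kn}

rng = np.random.default_rng(2)
a = rng.choice([-3.0, -1.0, 1.0, 3.0], size=4)   # real parts a(0), a(1), a(2), a(7)
b = rng.choice([-3.0, -1.0, 1.0, 3.0], size=4)   # imaginary parts b(0), b(1), b(2), b(7)
x_used = a + 1j * b

x_full = np.zeros(8, dtype=complex)
x_full[cols] = x_used
X_reduced = W_reduced @ x_used
print(np.allclose(X_reduced, np.fft.fft(x_full)))

# Linearity lets the real and imaginary inputs go through separate transforms.
print(np.allclose(X_reduced, W_reduced @ a + 1j * (W_reduced @ b)))
```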


To maintain the orthogonality of the eventual pulse-shaped waveforms, we process the real and
imaginary channels with separate 8-point DFTs. Each DFT
computation outputs a label for each resulting complex component contribution to the
modulated signal. A bank of polyphase filters takes each of these outputs and translates the
labels into 8-bit amplitude words. These are finally summed and the result is sent as the
modulated signal.

IDFT Approach at the Receiver 

On the receiver side, we are able to separate the data's real and imaginary components. We
then need to implement an efficient, direct computation of the IDFT. The small-N
(N=8) DFT algorithm equations are adapted from those found in the Handbook of Digital
Signal Processing [1]. The IDFT is computed by the following method:

IDFT by means of DFT: $(1/N)\,X^{*}(n)$, where $X(n)$ is the DFT function and $^{*}$ denotes the
conjugate operation.
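One common way to obtain an inverse transform from a forward one is the conjugation identity: conjugate the input, apply the forward DFT, conjugate the result, and divide by N. The sketch below illustrates that standard identity with NumPy; whether the paper's expanded equations follow exactly this form is an assumption.

```python
import numpy as np

def idft_via_dft(X):
    """Inverse DFT computed with a forward DFT and two conjugations."""
    X = np.asarray(X, dtype=complex)
    return np.conj(np.fft.fft(np.conj(X))) / len(X)

rng = np.random.default_rng(3)
X = rng.normal(size=8) + 1j * rng.normal(size=8)
print(np.allclose(idft_via_dft(X), np.fft.ifft(X)))
```

The benefit in hardware is that the same forward-transform datapath can serve both directions, with conjugation reduced to a sign change on the imaginary part.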

The equations are expanded for an 8-point DFT for our specific case of separate real and
imaginary signal components. The result is a dual set of equations for the outputs of interest.
We take advantage of the look-up table (LUT) approach possible in FPGA devices to cut down the number of
multiply operations. Since the amplitude data has a finite length of 8 bits, we are able to
predict all possible amplitudes. Then, instead of performing a traditional multiply operation, we
"look up" the result of the multiplication. On the receiving end we recover the original modulated
data for each of the four channels through an IDFT computation. This data is then passed on to
decoding and demodulation.
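The look-up idea can be sketched as follows: with an 8-bit input there are only 256 possible products for a fixed coefficient, so they can be precomputed once. The coefficient value, the rounding, and the function names are illustrative assumptions.

```python
import numpy as np

def build_lut(coefficient):
    """Products of one fixed coefficient with every signed 8-bit input value."""
    inputs = np.arange(-128, 128)
    return np.rint(coefficient * inputs).astype(np.int16)

def lut_multiply(lut, value):
    """Replace a multiply by a table read; the offset maps -128..127 onto 0..255."""
    return int(lut[value + 128])

lut = build_lut(np.cos(np.pi / 4))     # hypothetical twiddle-factor magnitude
print(lut_multiply(lut, 100))          # 71
```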

Polyphase Filters Design 

In most OFDM systems, the modulated data is left unshaped, with each channel having a
frequency response that falls off as $\sin(x)/x$. When the number of channels is large, this does not
adversely affect the overall system bandwidth efficiency. However, in this system, where the
number of channels is only four, unshaped modulated data causes excessive bandwidth use.
To alleviate this, shaping filters can be applied to limit the overall bandwidth.
However, this departure from sinc functions in the frequency domain must be applied carefully
to limit intersymbol interference and maintain adjacent channel orthogonality.

From a hardware architectural view, the number of taps in the overall filter should be kept 
small. Since the added complexity of interpolation filters is not warranted, the number of 
samples per symbol is defined from the DFT size; in this case it is 8. Furthermore, the number 
of symbols that the filter is defined over is set to two. This was chosen to limit the size of each 
polyphase filter to two taps, which greatly eases the implementation complexity in an FPGA. Thus
the overall filter is a 16-tap filter. In addition to the 16-tap constraint, the filter must also limit
ISI and ICI. Generally speaking, the goal of a shaping filter design is to limit the bandwidth
without causing ISI while using a limited number of taps. The root raised cosine (RRC) pulse is a
good candidate, but is not optimum. To get a truly optimum filter, one has to attempt to design
a filter that is finite in both time and frequency, which is fundamentally impossible. To find
the coefficients of the filter, we started with a truncated square root raised cosine (SRRC)
function [2].
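A truncated SRRC starting point with these dimensions might be generated as in the sketch below. The roll-off factor of 0.5, the half-sample-centered sampling grid, and the unit-energy normalization are assumptions, since the text does not state them; the final coefficients in the paper were refined beyond this starting point.

```python
import numpy as np

def srrc(t, beta):
    """Square-root raised cosine pulse at times t, in units of the symbol period."""
    t = np.asarray(t, dtype=float)
    h = np.empty_like(t)
    zero = np.isclose(t, 0.0)
    sing = np.isclose(np.abs(t), 1.0 / (4.0 * beta))
    reg = ~(zero | sing)
    tr = t[reg]
    h[reg] = (np.sin(np.pi * tr * (1 - beta))
              + 4 * beta * tr * np.cos(np.pi * tr * (1 + beta))) \
             / (np.pi * tr * (1 - (4 * beta * tr) ** 2))
    h[zero] = 1 - beta + 4 * beta / np.pi                      # limit at t = 0
    h[sing] = beta / np.sqrt(2) * ((1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                                   + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
    return h

SAMPLES_PER_SYMBOL = 8
SPAN_SYMBOLS = 2
BETA = 0.5                                   # assumed roll-off factor

n_taps = SAMPLES_PER_SYMBOL * SPAN_SYMBOLS   # 16 taps in total
t = (np.arange(n_taps) - (n_taps - 1) / 2) / SAMPLES_PER_SYMBOL
taps = srrc(t, BETA)
taps /= np.sqrt(np.sum(taps ** 2))           # normalize to unit energy

# Row p holds the two taps of polyphase filter p: taps[p] and taps[p + 8].
polyphase = taps.reshape(SPAN_SYMBOLS, SAMPLES_PER_SYMBOL).T
print(polyphase.shape)                       # (8, 2)
```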

The transmit polyphase filter uses the amplitude labels output from the DFT LUTs as input
addresses to groups of 16x8 LUTs that perform the coefficient multiply operations. There are
fundamentally two types of polyphase filter elements. The first is called the Small Polyphase
Unit (SPU), shown in Figure 1, and the second is the Large Polyphase Unit (LPU), shown in Figure
2. Each SPU uses one 4-bit DFT output label and each LPU uses one 5-bit DFT output label.



Figure 1- Small Polyphase Unit 



Figure 2- Large Polyphase Unit 


The SPU is a realization of a two-tap FIR filter. Since there are only 16 possible output levels
from the appropriate DFT bin, there are only 16 possible results at the output of each coefficient
multiply. Thus a 16x8 LUT is used for each coefficient multiply operation. The adder and delay
element perform the same function as in a conventional FIR filter. The LPU is slightly more
complex since it has a 5-bit input. The 5 bits require two 16x8 LUTs for each tap,
with the fifth bit selecting which output to use. Other than that, it is equivalent to the SPU.
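The SPU behavior described above can be mimicked in software as a two-tap FIR filter whose multiplies are table reads. In the sketch below the 16 amplitude levels and the two tap values are hypothetical placeholders, not values from the paper.

```python
import numpy as np

def make_spu(levels, taps):
    """Return a function that filters a 4-bit label stream with two LUT 'multiplies'."""
    # One 16-entry table per tap: every possible product, rounded to an integer word.
    luts = [np.rint(c * levels).astype(np.int16) for c in taps]

    def run(labels):
        labels = np.asarray(labels)
        current = luts[0][labels]                                 # tap 0 on the current label
        previous = np.concatenate([[0], luts[1][labels[:-1]]])    # tap 1 after a one-symbol delay
        return current + previous                                 # adder

    return run

levels = np.linspace(-105.0, 105.0, 16)   # hypothetical levels for one DFT bin
spu = make_spu(levels, taps=(0.25, 0.75)) # hypothetical tap pair
labels = np.array([0, 15, 7, 8])
print(spu(labels))
```

An LPU would follow the same pattern with two 16-entry tables per tap and the fifth label bit choosing between them.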

At the receiver, it is not possible to use amplitude labels since the incoming data from the
Analog-to-Digital Converter (ADC) is a noise-corrupted version of the transmitted data in which the amplitude levels carry
important information. Thus, the receive polyphase filter must operate with data quantized to
the ADC width, in this case 8 bits. To decrease the
implementation complexity, the 8-bit fixed-point coefficient multiply operations are replaced
by a canonical-signed-digit (CSD) representation that reduces the multiplies to a limited number
of fixed shifts and additions or subtractions [3]. The CSD coefficients are found by starting with
the floating-point representation and a specification of the number of quantization levels and the
number of nonzero elements allowed per coefficient. There are a number of methods available
in the literature [4] to search for the best set of CSD coefficients. In our case, we fix the number
of quantization levels at 256 and limit the number of nonzero digits in the CSD
representation to 1 and 2. Furthermore, we allow one additional nonzero digit for coefficients
larger than some value s, as in [3]. An algorithm then steps through gain factors from 0.5 to 1.0
with a predefined step size. At each gain factor the algorithm finds the closest CSD
representation for each coefficient. The mean squared error for this set of coefficients is then
determined and compared to the mean squared error for the previous gain factor, and the better
set of coefficients is kept, so that the best set remains at the end of the search. The quality of the
coefficients is determined by examining the spectral response of the CSD filter as compared to
the response of the floating-point version, and by inserting the CSD filter into a time domain
simulation that determines the BER using a semi-analytic approach.
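The gain-stepping search can be sketched as below. This is a simplified illustration, not the algorithm of [3] or [4]: the error is measured against the original coefficients after undoing the gain, the threshold s and step size are arbitrary, and the example coefficients are hypothetical.

```python
import numpy as np

def csd_nonzero_digits(x):
    """Number of nonzero digits in the canonical signed-digit form of an integer."""
    x = abs(int(x))
    count = 0
    while x:
        if x & 1:
            x -= 2 - (x & 3)     # digit is +1 when x % 4 == 1, and -1 when x % 4 == 3
            count += 1
        x >>= 1
    return count

def allowed_values(max_digits, bits=8):
    """All values m / 2^(bits-1) whose CSD form has at most max_digits nonzero digits."""
    scale = 2 ** (bits - 1)
    return np.array([m / scale for m in range(-scale, scale + 1)
                     if csd_nonzero_digits(m) <= max_digits])

def quantize(coeffs, base_digits, threshold):
    small = allowed_values(base_digits)
    large = allowed_values(base_digits + 1)      # extra digit for large coefficients
    out = np.empty_like(coeffs)
    for k, c in enumerate(coeffs):
        table = large if abs(c) > threshold else small
        out[k] = table[np.argmin(np.abs(table - c))]
    return out

def search(coeffs, base_digits=2, threshold=0.5, step=0.01):
    """Step the gain from 0.5 to 1.0 and keep the set with the lowest mean squared error."""
    coeffs = np.asarray(coeffs, dtype=float)
    best = None
    for gain in np.arange(0.5, 1.0 + step / 2, step):
        q = quantize(gain * coeffs, base_digits, threshold)
        mse = np.mean((q / gain - coeffs) ** 2)  # assumed error measure
        if best is None or mse < best[0]:
            best = (mse, gain, q)
    return best

h = np.array([0.02, -0.07, 0.11, 0.43, 0.81, 0.43, 0.11, -0.07])  # hypothetical taps
mse, gain, q = search(h)
print(gain, q * 128)
```

Each quantized coefficient is then a sum of at most three signed powers of two, so the multiply becomes a few shifts and adds.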

System Modeling and Simulations Results 

An OFDM system model is being developed using the Signal Processing Workstation tools.
The basic OFDM waveform is constructed by dividing an incoming data stream into four
channels. The baseline rate 7/8 16QAM Four Dimensional Pragmatic Trellis Coded Modulation
(4D-PTCM) scheme [6] with a Reed-Solomon (RS) (255,239) code is being developed for each of the
four channels. In addition to the baseline rate 7/8 16QAM, the trellis encoder also supports
rate 5/6 8-PSK and rate 3/4 16QAM. After trellis encoding, the bits are mapped into
modulation symbols represented by I- and Q-amplitude levels [5]. The bit-to-symbol mapping
is chosen in accordance with the encoding scheme to obtain the full benefit of TCM. We then
process each channel's modulated waveform through a DFT computation that produces a label
for each resulting complex component contribution to the modulated signal. The polyphase
filters take these outputs and translate the labels into 8-bit amplitude words. These are finally
summed and the result is sent as the modulated signal. At the receiving end, since the
incoming data from the ADC is severely corrupted by noise, it is not possible to use amplitude
labels. Therefore, the receive polyphase filters (CSD) must process the data quantized to the ADC
width. We recover the original modulated data for each of the four channels
through an IDFT computation. The data is then passed on to decoding and demodulation.

The simulation model contains four channels (each with an encoder and modulator), a DFT block,
and polyphase filters at the transmitter; and a polyphase filter (CSD), an IDFT block, and four channels
(each with a decoder and demodulator) at the receiver. A pseudo-random number generator is
used to produce binary signal sequences. An Additive White Gaussian Noise (AWGN) source
of zero mean and power spectral density $N_0/2$ is used to add channel noise to the system. The
bit-error-rate (BER) performance in the AWGN channel is evaluated and the BER plots of
various schemes are shown in Figure 4.
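For orientation only, the following Monte Carlo sketch estimates the BER of plain uncoded 16QAM in AWGN. It is not the Signal Processing Workstation model described above: there is no coding, pulse shaping, or multi-carrier processing, and the Gray labeling of the amplitude levels is an assumption.

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])
GRAY = np.array([0b00, 0b01, 0b11, 0b10])   # assumed Gray label per amplitude index
BIT_COUNT = np.array([0, 1, 1, 2])          # number of set bits in a 2-bit value

def ber_16qam(ebn0_db, n_sym=200_000, seed=0):
    """Estimate the bit error rate of uncoded 16QAM at a given Eb/N0 in dB."""
    rng = np.random.default_rng(seed)
    tx = rng.integers(0, 4, size=(2, n_sym))       # I and Q amplitude indices
    eb = 10.0 / 4.0                                # mean symbol energy 10, 4 bits per symbol
    n0 = eb / 10.0 ** (ebn0_db / 10.0)
    # Each rail sees real Gaussian noise with variance N0/2.
    rx = LEVELS[tx] + rng.normal(0.0, np.sqrt(n0 / 2.0), size=tx.shape)
    det = np.clip(np.rint((rx + 3.0) / 2.0), 0, 3).astype(int)   # nearest-level decision
    errors = BIT_COUNT[GRAY[tx] ^ GRAY[det]].sum()
    return errors / (4 * n_sym)

for ebn0 in (4, 8, 12):
    print(ebn0, ber_16qam(ebn0))
```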

Conclusion 

An OFDM system is developed by splitting the incoming data stream into a number of low-rate
channels (N=4) that are stacked in frequency and separated by 1/T, where T is the symbol period. The baseline
configuration of the system supports the OC-12 data rate of 622 Mbps. To achieve an efficient
implementation, the combination of DFT and IDFT for frequency translation, and polyphase
and CSD filters for pulse shaping, is used. The four-channel approach enables the
implementation of a system that can be both power and bandwidth efficient, yet enough
parallelism exists to meet higher data rate goals.
Figure 3- 622 Mbps OFDM Modem System